How Data Happened: A History from the Age of Reason to the Age of Algorithms

  • Downloads:9588
  • Type:Epub+TxT+PDF+Mobi
  • Create Date:2024-04-16 03:20:34
  • Update Date:2025-09-07
  • Status:finish
  • Author:Chris Wiggins, Matthew L. Jones
  • ISBN:1324074582
  • Environment:PC/Android/iPhone/iPad/Kindle

Summary

“Fascinating.” ―Jill Lepore, The New Yorker

A sweeping history of data and its technical, political, and ethical impact on our world. From facial recognition (capable of checking people into flights or identifying undocumented residents) to automated decision systems that inform who gets loans and who receives bail, each of us moves through a world determined by data-empowered algorithms. But these technologies didn’t just appear: they are part of a history that goes back centuries, from the census enshrined in the US Constitution to the birth of eugenics in Victorian Britain to the development of Google search. Expanding on the popular course they created at Columbia University, Chris Wiggins and Matthew L. Jones illuminate the ways in which data has long been used as a tool and a weapon in arguing for what is true, as well as a means of rearranging or defending power. They explore how data was created and curated, as well as how new mathematical and computational techniques developed to contend with that data serve to shape people, ideas, society, military operations, and economies. Although technology and mathematics are at its heart, the story of data ultimately concerns an unstable game among states, corporations, and people. How were new technical and scientific capabilities developed; who supported, advanced, or funded these capabilities or transitions; and how did they change who could do what, from what, and to whom? Wiggins and Jones focus on these questions as they trace data’s historical arc, and look to the future. By understanding the trajectory of data (where it has been and where it might yet go), Wiggins and Jones argue that we can understand how to bend it to ends that we collectively choose, with intentionality and purpose. 8 photographs.

Reviews

taranis

not very good, but I can appreciate the sheer knowledge that is present here. too bad i couldn’t digest it

James Yoon

Interesting book. Definitely worth reading. While the writing could have been better and less repetitive, the perspective and overview of the book were excellent. The book does a great job explaining that data is not neutral and that there is a power dynamic associated with data (as well as technical, error, and bias concerns). The book will help the reader analyze the choices and trade-offs that we will be making with AI and data. I learned a lot.

Levyj93

A decent overview, albeit written from the NYT “view from nowhere” … i.e., mainstream liberalism. The footnotes are the highlight for me, as they point to many foundational primary sources of statistics, data science, computing, etc. A great number of secondary sources/commentaries on the subject matter as well. Use the footnotes!

Jeremy

Read (most of) chapters 1-4, 11-12. Kind of long-winded, but some interesting historical points on how mathematical statistics is rooted in eugenics. Would be interested in coming back to this book.

D J Rout

This book came recommended by Patrick Wyman, and Dr Wyman is a very good researcher who cites his sources, so I thought this book would be written in the same way. I'm glad to say that it is. About 30% of the book is the list of references from each chapter. I haven't read anything this well researched since Track Changes: A Literary History of Word Processing. Similarly, the book is written in chronological order, tracing the origins of data collection and the various uses it has been put to over the years. It then covers various aspects of data collection as it is currently used, delves into the ethics of privacy, the use and abuse of profiling, and government and corporate responses to the collection and use of data. It even has a chapter which attempts to predict the future. The style is not too academic, and gets a little bit folksy towards the end, but it's always accessible, and if you want to look up the more academic works it cites, there are those citations at the end. It even quotes Jane Jacobs' The Nature of Economies, which gives it all the credibility I need. The folksy style isn't without its problems: one phrase that turns up multiple times is 'they turned data into a thing'. What the hell does that mean? I didn't find out in the book. If you're ever worried about how The Powers That Be are spying on you, or that your life is in the hands of people who know you better than you know yourself, or are just annoyed by having to fill in information forms just to go to the doctor's, this book gives you enough history and philosophy to talk to like-minded people intelligently.

Olivia Wesel

Violently mid, could’ve used better editing and organization. Some good anecdotes but generally obvious

Bryan Alexander

A useful tour through the history of statistics. The one weakness is that it keeps hitting the "statistics can be biased in bad ways" theme over and over again. Perhaps I'm too immersed in tech criticism, but it seemed overmuch. More, the text tends to locate problems within statistics, when the causes of their bad use are usually external. Caveat aside, recommended.

Zach Lewis

Interesting material, but poorly written

David

Very informative and interesting history of statistics and the rise of big data.

Relena_reads

There were points where this got too academic to be a good pop culture book, and points where it glossed too much. It was still very interesting.

Dan

I really enjoyed listening to this audiobook; there was so much history that I was not aware of. Though it is framed as a history of "data," it is really a mixture of computer science history and the history of statistics (data science being a mix of those fields, among others, and generally being a term that was invented this century). The authors, both professors, had trouble writing for a general audience rather than for a textbook; some chapters were dense with jargon and difficult listening, especially the chapters about alternative approaches to AI, a kind of dark winter in AI stretching from the 60s to the 80s or 90s (not sure). But the information was fascinating. Some random facts that I'm probably remembering with partial accuracy:
- The history of statistics is intertwined with the eugenics movement, and based on my recollection of that section of the book, it seems modern statistics wouldn't exist if it wasn't for eugenics and racialized thinking from the early 20th century. Very cringeworthy stuff there!
- Statistics started in astronomy, growing out of the need to aggregate many observations of celestial bodies, and the inherent errors caused by subpar instruments, to get closer to the "real" data. Later, statistics were developed using similar techniques to measure abstractions. If you average out dozens of measurements of a star's movement, you are getting closer to something real, some phenomenon that objectively exists. If you average out the heights of all the males in your state, you are measuring something that doesn't really exist, an abstract concept called "average height." So basic that I'm embarrassed to say that I hadn't really thought about that much, probably since I took graduate-level statistics.
- Artificial Intelligence, or AI, especially the flavor of it found in data science today, is very old, and many of the techniques, or at least early conceptions of them, were developed in the 40s and 50s. In some ways the groundwork for ChatGPT and such was created in the dawn of computing. The thing that changed, in my simplified recollection of the book, is that computing power increased and humans can now process a lot more data, making those techniques more powerful. But it is not like humans have invented new techniques.
The book had very good sections on the ethics of data, and on how the US punted on opportunities to grapple with questions of who owns data about citizens in the 70s and really never looked back. They make the point that today's world, where huge tech companies use our data to make money via advertisers and deliver us "free" services, was not inevitable but rather a policy choice, or numerous choices and non-choices over decades.
The last point I want to mention, a political and cultural observation, is the ridiculousness of the myth of Silicon Valley: the all-powerful tech demi-gods who, in their genius, have created these amazing companies from whole cloth, the great inventors of Facebook, Apple, and Google. What isn't typically included in that myth is that the US government, and mostly the defense and intelligence industry, poured gargantuan amounts of money, year after year, decade after decade, from WWII up until the present day, to develop the computing power, the statistical approaches used today, and many, many approaches that didn't succeed (paths that our Silicon Valley friends didn't have to stumble down). How rich, pun intended, it is to hear the likes of Elon Musk, Peter Thiel, and others of their ilk complain about taxes, government regulation, deficits, etc. when all their riches are deeply indebted to government largess? Rich indeed.

David Childers

This book is a somewhat disjointed attempt to give a history of modern data analysis and its role in society, describing a collection of different communities which offered approaches to data analysis and trying to connect them through common themes, particularly the relation of data to state and corporate power and objectives. It begins with a fairly standard history of early statistics and statisticians that formed the core of a modern intro to statistics class (means, variances, regression, and tests through Quetelet, Galton, Fisher, Gosset, Neyman, Pearson, and Mahalanobis), though with a more pointed discussion of social context than usually offered. Certainly, e.g., Galton and his contemporaries were not just "incidentally" racist but specifically developed statistical methodology primarily for the promotion of eugenics and the goal of "scientifically" demonstrating that some people are superior to others. They trace the rise of mathematical statistics to WWII, with Abraham Wald in the US going on to spread a mathematicized discipline with encouragement from US government funding, but have little to say about the actual content or subsequent theoretical developments, which they mostly dismiss as abstruse and not useful. They cover Bayesian statistics largely through the codebreaking work of Turing and Good at Bletchley Park and subsequent interest from the intelligence community. Here and in later sections in which they emphasize classified work and the military-industrial complex, the textual evidence for the external impact becomes, perhaps necessarily, more meager. One could certainly tell stories about the links, like the origins of MCMC in the Manhattan Project, but they don't have much to say about academic and applied Bayesian statistics outside of government, let alone its origins. In the later part, they cover a variety of communities with less historical link to statistics as a discipline, including a standard history of Artificial Intelligence starting with McCarthy and the Dartmouth conference, through expert systems, the kernel era, up through contemporary deep learning, told primarily as a story about military and intelligence agency funding priorities. For the near-contemporary era they emphasize the applied and computational turn in the rise of data science, and the fields of databases and knowledge mining up through the era of the internet and big data; for these topics they divide the discussion of influences between military and corporate, emphasizing private surveillance and advertising. It ends with a discussion of data ethics, both as an area of contestation and practice, and as a reminder of the ways in which social forces shape practice. Overall, the picture presented in this book is more thematic than narrative, a discussion of the material as opposed to the intellectual origins of data. The book starts with contemporary critiques of data collection and analysis as used by prominent large companies and then seeks to reinforce the critiques by searching back in time for analogous historical (mis)uses. This enhances the point but is paradoxically enervating, as portraying these issues as constant companions of data analysis makes the contemporary problems seem both less severe and less tractable.

Tej

4.4375

Yoo Chung

In large part, this is a history of how data collection, statistics, and data analysis developed from their humble beginnings in ancient times to today's colossal computing infrastructure. The last part is rather disappointing, however, in that it feels mostly like a laundry list of criticisms about modern data collection and usage, especially with its almost complete focus on large tech companies that are already well known and not much about the much wider universe of data today.

Craig Fiebig

Decent book on the history of the use (and abuse) of data.

Christopher

Wow. Super interesting book. Had to put it down a few times to think about it. Worth reading if you work with/in data, or if you live in a world that is determined in part or entirely by data and what people do with it.

John Lim

Well-researched, although in need of some serious copyediting. Lacks a lot of dates where they'd be helpful, uses block quote formatting for lines that aren't actually quotes, and is repetitive at times. I'm looking forward to the second edition coming out next year, which will hopefully make this as compelling and accessible as it deserves to be.

CatReader

3 stars. Interesting content and very ambitious in scope (this book aims to summarize the entire field of data science!!), but hard to digest. The audiobook version clocks in at 10 hours but it took me significantly longer than usual to finish given how dense and often repetitive the writing is. I found myself repeatedly stopping after a few minutes due to tediousness or boredom.
Further reading (many of these topics are covered in blurb-like fashion in this work, so I would recommend supplemental reading if any topics in this book piqued your interest):
Historical context:
- The Secret Lives of Codebreakers: The Men and Women who Cracked the Enigma Code at Bletchley Park by Sinclair McKay (2010)
- Proving Ground: The Untold Story of the Six Women Who Programmed the World’s First Modern Computer by Kathy Kleiman (2022)
- The Idea Factory: Bell Labs and the Great Age of American Innovation by Jon Gertner (2012)
Current topics:
- The Data Detective: Ten Easy Rules to Make Sense of Statistics by Tim Harford (2020)
- Invisible Women: Data Bias in a World Designed for Men by Caroline Criado Perez (2019)
- The Myth of Artificial Intelligence: Why Computers Can't Think the Way We Do by Erik Larson (2021)
- The Big Nine: How the Tech Titans and Their Thinking Machines Could Warp Humanity by Amy Webb (2019)

Jenn Adams

How did we get to the point where "data" (such a general term) and algorithms are so integral to day-to-day existence? I liked how this book took us all the way back to explore the historical foundation and made the topic approachable for a general audience. 4.5

Shana Yates

3.5 stars

Davíð Björnsson

Interesting subject matter but poorly written.

Alejandro Jaramillo

An interesting and fairly quick review of key historical points in the development of statistics, data science, and modern data infrastructures and economies. Necessarily sacrifices breadth for depth, but does so selectively, choosing one major “shift” to focus on in each chapter. I especially appreciated how thoughtfully they highlighted epistemological battles dating back to the 18th century about how statistics should be used and what we can verifiably “know” using the methodologies that would eventually undergird data science. I also want to commend the authors for doing their due diligence in elevating other critical data scholars (especially women of color) who have made substantial contributions to the field. The book was unfortunately riddled with grammatical errors (which did interfere with my comprehension on occasion).

Florent Diverchy

Extremely interesting history of data, from the origins of state-istic (science of the state) to generative AI. Great research work, with the usual milestones: IQ & Indian castes, p-value & Guinness beer, Tuskegee & consent, ... Also a great reminder that AI was rule-based (and so explainable) for decades, before being overruled by statistical AI. A 4-star and not a 5-star because the last chapter is, in my opinion, a bit weak and boring to read.

Kenny

Read for Data Science book club. Really ambitious book, and it seems like maybe it bit off more than it could chew. Definitely learned some things about the development of data science and the forces behind that development, but the organization of the book was a little awkward at times.

Johanne

DNF. I made it to Chapter 6, which had a good start, but the Bayesian principle did me in. Who is the audience for this book? Statisticians know all this already, and it is way too involved for the simply curious, like me, who want to learn more.

Russ

Reads like a text book, or rather more likely a book slapped together from teachers’ notes after the course was completed. Poorly written, carelessly referenced, unintelligible to outsiders of the data analysis field. Too bad because the history of data - how we got to where we are today with unregulated big data collection, surveillance technology, automated disinformation, deep fakes and fake news, undisclosed algorithms and undecipherable AI - is important. Unfortunately, this book is not well directed to the curious but uninitiated reader like me. Perhaps I’d be better served by taking Profs Wiggins’ and Jones’ class at Columbia, with interactive discussions and Q&A.

Iain Bertram

Get the print version. The functions do not translate well to audio and the narrator doesn't know how to speak maths.

David Hackett

Well written but at times it reads more like a text book. It is a good survey of the history of data, better than other similar history books I’ve read but far from the most incisive on the topic of data and artificial intelligence. Worth it if you have more than just a casual interest in the history of stats, data, and data science.

Amanda

DNF. The writing and organization really interfered with what could be a fascinating subject.

Oscar

A non-fiction book looking at the history of data. I had heard about it from a Tides of History episode interviewing the authors. The book loosely covers several historical periods and looks at key figures in the history of data. It covers early pioneers in the statistical movement in the industrial era, the use of statistics in the enigma machine, and the development of data tools in the postwar period into the information era. The book does provide some background, although there is a lack of contextualisation and analysis. For example, the first chapter covers Francis Galton and his development of statistics to support his eugenic ideals. Whilst this in itself was interesting as the origin of the bell curve in statistics, there was little done to discuss or critically appraise his use of the bell curve beyond saying that he misused it. Unfortunately, this is a running theme throughout the book, with plenty of narrative, but little analysis and synthesis. The book also moves from fact to fact in a running narrative, rather than giving some time for the reader to process points. Even a bit more time spent examining why techniques were useful and some actual details would have been helpful. For example, in the chapter covering the use of techniques by Guinness to improve beer outputs, it would have been helpful to know how practice changed when the data poor environment was configured and changed. The later chapters looking at the modern use felt like totally uncovered ground to me, and these were more interesting. However, they were hampered again by the lack of elaboration and explanation, which, whilst admittedly it may not be wholly possible given the lack of available information on recent events, could have helped me to at least understand, for example, why the ethics board at Google failed. Overall, I felt this book had promise but it needed editing into tighter narratives focused on concepts, rather than the reciting of events.
